Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add some more tests #33

Open
wants to merge 11 commits into
base: rdbms-sink
Choose a base branch
from
Open

Add some more tests #33

wants to merge 11 commits into from

Conversation

evansiroky
Copy link
Contributor

Add tests for the VanillaExtractServer and PostgresSQL.

Also, calculate code coverage and send results to codecov.io.

The statement to set the representative position of a way was having
weird issues on travis, so I refactored it.
Also, I renamed the a column called “nodes” to “node_ids” to avoid
calling a table and column the same thing.
@codecov-io
Copy link

codecov-io commented Sep 15, 2017

Codecov Report

❗ No coverage uploaded for pull request base (rdbms-sink@baa67c6). Click here to learn what that means.
The diff coverage is 38.46%.

Impacted file tree graph

@@              Coverage Diff              @@
##             rdbms-sink      #33   +/-   ##
=============================================
  Coverage              ?   51.94%           
  Complexity            ?      277           
=============================================
  Files                 ?       37           
  Lines                 ?     2058           
  Branches              ?      233           
=============================================
  Hits                  ?     1069           
  Misses                ?      910           
  Partials              ?       79
Impacted Files Coverage Δ Complexity Δ
...in/java/com/conveyal/osmlib/PostgresOSMSource.java 77.27% <100%> (ø) 10 <2> (?)
...rc/main/java/com/conveyal/osmlib/PostgresSink.java 77.51% <100%> (ø) 16 <0> (?)
.../main/java/com/conveyal/osmlib/VanillaExtract.java 61.76% <20%> (ø) 3 <1> (?)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update baa67c6...9b2abff. Read the comment docs.

@evansiroky evansiroky requested a review from abyrd September 15, 2017 17:53
Copy link
Member

@abyrd abyrd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good as a first step toward expanding testing of these libraries.

@@ -24,6 +29,9 @@ script: |
mvn clean verify --settings maven-settings.xml
fi

after_success:
- bash <(curl -s https://codecov.io/bash)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is running arbitrary code off a remote URL. This raises a red flag for me, especially on a repo that contains signing keys. Can we copy the script into our repo so we know what we're running?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The script at the url is pretty large and may change over time. I think this request is a bit excessive because that would also mean that we should examine and put every single maven dependency in our repo because that code could also echo environment variables or send it to a remote server.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True enough but Maven artifacts are signed. This is taking arbitrary code from a URL with no verification, and anyone can stick a proxy in between you and the server. It's always debatable how paranoid one needs to be, but the recent trend in piping the contents of URLs into bash is really a security hole.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that it's also unlikely we'd ever know if someone got the keys, which then compromises the signing system of Maven. The security problem is contagious.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah... but it's an https URL so there's a certificate for a particular domain involved. Maybe it's not so insecure then...

pom.xml Outdated
@@ -135,6 +135,25 @@
</execution>
</executions>
</plugin>
<plugin>
<groupId>org.jacoco</groupId>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add XML comments to the POM briefly explaining what this is. I know all the other dependencies don't have this but we should start adding it. It makes it less painful to browse through in the future.

@@ -246,7 +244,10 @@ public void writeEnd() throws IOException {
// statement.execute("create index on nodes(lat, lon)");
LOG.info("Assigning representative coordinates to ways...");
statement.execute("alter table ways add column rep_lat float(9), add column rep_lon float(9)");
statement.execute("update ways set (rep_lat, rep_lon) = (select lat, lon from nodes where nodes.node_id = nodes[array_length(nodes, 1)/2])");
statement.execute("UPDATE ways " +
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally I don't like the peculiar SQL convention of capitalizing keywords, especially since it's not checked by the parser so doesn't add any semantic safety. It mostly just achieves a cyber 1980s look. It's of course not going to directly hurt anything to have some statements using one convention and some using another, but just for consistency I'd rather do all lower case. This is a pretty superficial comment though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, I'll change it to lowercase

@@ -71,16 +66,14 @@ public static void main(String[] args) {
try {
httpServer.start();
LOG.info("VEX server running.");
Thread.currentThread().join();
// updateThread.interrupt();
//Thread.currentThread().join();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did you take out the thread join and the server shutdown? It will probably work without it but it's more clean to properly shut down.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did this because it was causing the VanillaExtractTest to hang when it tried to start the server. The test would not continue after the server started up.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we should be changing the flow of our main method in order to make a test easier to write. We need to establish clearly what we're losing by not doing the join and clean shutdown -- this doesn't seem like the kind of thing that should just be changed without understanding the consequences.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not a java expert so I don't understand what the thread joining does.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thread.currentThread().join(); is kind of strange. It just blocks the current thread forever, waiting for itself to end. It only makes sense because this is intended to be run as a command line main method. Once the server thread starts, we don't want the main thread to shut down and end the whole process. This is another reason that if we want to test this logic, it should be factored out and called slightly differently from a test and from a main method.

.travis.yml Outdated
@@ -8,6 +8,11 @@ jdk:
before_install:
- openssl aes-256-cbc -K $encrypted_3ec7d78ab93d_key -iv $encrypted_3ec7d78ab93d_iv -in maven-artifact-signing-key.asc.enc -out maven-artifact-signing-key.asc -d
- gpg --import --batch maven-artifact-signing-key.asc
- psql -U postgres -c 'CREATE DATABASE osm_lib_test;'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably add comments to this file explaining why all these lines are here. Not just the database setup for the tests but the encrypted variables etc.

import static org.hamcrest.MatcherAssert.assertThat;
import static org.hamcrest.Matchers.is;

public class PostgresTest {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a matter of policy all new classes and methods should have comments. This should contain a human-readable explanation of what is being tested.

}
}

class TableTestCase {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice clear data types.

};

// perform file to postgres load
PostgresSink.main(args);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find it kind of strange to be passing string arguments into a main method in a test. Main methods are really only intended to be used as a command line interface. Ideally we'd pull the "load a feed" functionality out of the main method and call the same functionality from the main and the test. But this will do for now.

import static org.hamcrest.Matchers.*;

public class VanillaExtractTest {
static final String TEST_FILE = "./src/test/resources/porto_portugal.osm.pbf";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Resources are associated with a package (in this case com.conveyal.osmlib) so it should be possible to put them in the appropriate place (/src/test/resources/com/conveyal/osmlib/porto_portugal.osm.pbf) then open them as a stream without giving full paths. That would require us to have load methods that accepted something other than a string path though.


// assert that the response is not an error response
assertThat(response, not(containsString("URI format")));

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should also attempt to decode the data that comes back as a PBF file and make sure it contains anything meaningful.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added some more logic that tests this.

connection = DriverManager.getConnection(pgUrl);
} catch (SQLException e) {
e.printStackTrace();
LOG.error("Error creating new database!");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This log message does not correspond to the purpose of the method.

connection.prepareStatement(statement).execute();
} catch (SQLException e) {
e.printStackTrace();
LOG.error("Error creating new database!");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This log message does not correspond to the purpose of the method.

* An MD5 can be represented as 32 hex digits so we don't save much length, but we do make it entirely alphabetical.
* log base 2 of 26 is about 4.7, so each character represents about 4.7 bits of randomness.
*
* The question remains of whether we need globally unique IDs or just application-unique IDs. The downside of
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code appears to be copied out of gtfs-lib. Since these are separate libraries I guess we have no choice but to duplicate the code. But the javadoc comments contain references to "feeds" and "older databases" which only make sense in the gtfs-lib context.

I'm also wondering why we need random database names - weren't you doing it before with a single predetermined name?

Copy link
Contributor Author

@evansiroky evansiroky Sep 19, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was using a single name, but I want to avoid that because I want to run each test on it's own database so that tests won't depend on each other. I've thought of a different idea for creating unique IDs and will implement that instead.

@abyrd
Copy link
Member

abyrd commented Sep 21, 2017

Other than the concerns about curl | bash (for which I don't think there's a much better solution) this looks pretty good. Should we merge and continue to add more tests later?

@evansiroky
Copy link
Contributor Author

@abyrd This PR also looks like it is awaiting your approval.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants